Identifying the Most Suitable Stemmer for the CHiC Multilingual Ad-hoc Task
Abstract
Because the 2013 Cultural Heritage in CLEF (CHiC) lab focused on multilingual retrieval, our goals were to integrate Apache Solr into our Xtrieval framework and to evaluate different stemmers available for most of the relevant languages. As there were thirteen languages to cover, we tried to find a generic stemmer that works with all of them. We experimented with four setups: one without any stemmer, two that mainly used rule-based stemmers, and one that used a dictionary-based stemmer. For the dictionary-based setup we employed the HunSpell stemmer, which works with the same dictionaries as OpenOffice.
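To make the difference between the stemming strategies concrete, the following is a minimal Python sketch, not the Solr/Xtrieval configuration used in the experiments. It contrasts three approaches: no stemming, a toy rule-based suffix stripper, and a toy dictionary lookup. The suffix list, the dictionary entries, and all function names are hypothetical and chosen only for illustration; real rule-based and HunSpell stemmers are far more elaborate.

```python
from typing import Callable, Dict, List

# Hypothetical suffix list for the toy rule-based stemmer, longest first.
SUFFIXES = ("ings", "ing", "es", "s")

# Hypothetical word-to-stem dictionary standing in for a real
# dictionary-based stemmer; the entries are illustrative only.
TOY_DICTIONARY: Dict[str, str] = {
    "paintings": "painting",
    "churches": "church",
    "mice": "mouse",
}


def no_stem(token: str) -> str:
    """Baseline: only lowercase the token."""
    return token.lower()


def rule_based_stem(token: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    word = token.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def dictionary_stem(token: str) -> str:
    """Look the word up in the dictionary; unknown words pass through."""
    word = token.lower()
    return TOY_DICTIONARY.get(word, word)


SETUPS: Dict[str, Callable[[str], str]] = {
    "none": no_stem,
    "rule-based": rule_based_stem,
    "dictionary": dictionary_stem,
}


def analyze(text: str, stem: Callable[[str], str]) -> List[str]:
    """Split on whitespace and apply the given stemmer to every token."""
    return [stem(token) for token in text.split()]


def compare(text: str) -> Dict[str, List[str]]:
    """Run every setup on the same text so the index terms can be compared."""
    return {name: analyze(text, stem) for name, stem in SETUPS.items()}


if __name__ == "__main__":
    for name, terms in compare("Paintings churches mice").items():
        print(name, terms)
```

In this sketch the rule-based stemmer over-strips "paintings" and leaves the irregular plural "mice" untouched, whereas the dictionary lookup handles both but only for words it contains. This is the general trade-off between the two families of stemmers.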
Similar resources
Cultural Heritage in CLEF (CHiC) 2013 - Multilingual Task Overview
The Cultural Heritage in CLEF 2013 multilingual task comprised two sub-tasks: multilingual ad-hoc retrieval and semantic enrichment. The multilingual ad-hoc retrieval sub-task evaluated retrieval experiments in 13 languages (Dutch, English, German, Greek, Finnish, French, Hungarian, Italian, Norwegian, Polish, Slovenian, Spanish, Swedish). More than 140,000 documents were assessed for relevance...
Cultural Heritage in CLEF (CHiC) 2013
The Cultural Heritage in CLEF 2013 lab comprised three tasks: multilingual ad-hoc retrieval and semantic enrichment in 13 languages (Dutch, English, German, Greek, Finnish, French, Hungarian, Italian, Norwegian, Polish, Slovenian, Spanish, and Swedish), Polish ad-hoc retrieval and the interactive task, which studied user behavior via log analysis and questionnaires. For the multilingual and Pol...
CEA LIST's Participation at the CLEF CHiC 2013
For our first participation in the CLEF CHiC Lab, we submitted runs to the multilingual ad-hoc and multilingual semantic enrichment tasks. Given the strong multilingual character of the evaluation corpus, the main objectives of the experiments were to test the efficiency of semantic topic expansion and consolidation based on Explicit Semantic Analysis (ESA) versions in different languages. Anot...
Multimedia Information Modeling and Retrieval (MRIM) /Laboratoire d'Informatique de Grenoble (LIG) at CHiC2013
Numerous cultural heritage materials are accessible through online digital library portals. However, this conversion resulted in issues of inconsistency and incompleteness. The Cultural Heritage in CLEF 2013 (CHiC) lab took the initiative to organize an evaluation campaign involving several tasks: 1) a multilingual task, 2) a Polish task and 3) an interactive task. We present the results o...
The Sheffield and Basque Country Universities Entry to CHiC: Using Random Walks and Similarity to Access Cultural Heritage
The Cultural Heritage in CLEF 2012 (CHiC) pilot evaluation included these tasks: ad-hoc retrieval, semantic enrichment and variability tasks. At CHiC 2012, the University of Sheffield and the University of the Basque Country submitted a joint entry, attempting the three English monolingual tasks. For the ad-hoc task, the baseline approach used the Indri Search engine. Query expansion approaches...
Publication date: 2013